Abstract

It is important to explore score reliability in virtually all studies, because tests are not reliable.
Researchers need to understand score reliability because of the possible impact reliability has on
the interpretation of research results. There are several common misconceptions about the basic
ideas of score reliability. Misconceptions are formed due to lack of understanding of the concept
of reliability and through careless speech involving statistical jargon. This paper addresses
common misconceptions so that later discussions of score reliability will not be hindered.
Misconceptions have caused some authors to devalue the reporting of reliability estimates in
published research, while others report reliability coefficients inappropriately. A better
understanding of score reliability can resolve these misconceptions and enable authors to use
reliability coefficients appropriately in literature and speech. The paper first introduces the basic
ideas of score reliability and concludes with an explanation of the most frequently used reliability
estimate, coefficient alpha, so that the coefficient's conceptual underpinnings will be understood.
Understanding a Widely Misunderstood Statistic: Cronbach’s α
Researchers often want to evaluate the importance of a study’s results by using at least
one of three types of significance: statistical significance, practical significance, and clinical
significance. As practical significance gains support in publications, researchers will begin to
notice the influence that reliability has on effect sizes and statistical power against Type II error.
Researchers need to understand score reliability because of the possible impact reliability has on
the interpretation of research results. Thompson (1994) warns:

The failure to consider score reliability in substantive research may exact a toll on 
the interpretations within research studies. For example, we may conduct studies 
that could not possibly yield noteworthy effect sizes given that score reliability 
inherently attenuates effects sizes. Or we may not accurately interpret the effect 
sizes in our studies if we do not consider the reliability of the scores we are actually 
analyzing. (p. 840)

There are several common misconceptions about the basic ideas of score reliability. 
Misconceptions are formed due to lack of understanding of the concept of reliability and through 
careless speech involving statistical jargon. One should address these misconceptions to prevent 
misinterpretations of research results. A common misconception is that reliability is a 
characteristic of a test or a measurement tool; however, reliability instead is a characteristic of 
scores. Spearman (1904) introduced this characteristic by utilizing a method that measures each 
individual multiple times. In this method, Spearman determined reliability based on the 
consistency of the individual’s scores across equivalent measurement forms. If consistency is
seen across measurement forms, then one can conclude that the scores are reliable. If there is no
consistency across measurement forms, then one can conclude the scores are not reliable. The
method that Spearman (1904) applied shows that individual scores were tested, not the
measurement tool. As Henson (2001) suggested, “Because scores may vary in degree of
reliability, a given test may yield grossly divergent reliability estimates on different 
administrations” (p. 178). 

Another common misconception is that reliability is equivalent to validity. Validity 
pertains to the extent to which scores measure the intended concept. Reliability determines if the 
scores measure anything, while validity determines to what extent the scores measure the 
intended something. The relationship between validity and reliability is analogous to the 
relationship between effect size and p_calculated. For example, if a person repeatedly measured the
same two grams of seasoning for a given recipe, consistently producing the same estimate of the 
seasoning’s weight, this may support that the scores are reliable. However, if one implies from 
the scores, “This recipe tastes great because two grams of seasoning can make anything taste 
good,” then questions of score validity may surface from the individual’s dinner guests. Scores 
must be reliable to even consider if the scores are valid, but reliability does not necessarily imply 
validity. If scores were not reliable, then one would merely be consistently measuring nothing. 
Reliability is not equivalent to validity because reliability and validity are two separate properties 
of scores. These misconceptions are demonstrated through the verbiage in journal articles 
(Thompson, 1992) and careless jargon used in informal speech (Thompson, 2003). Thompson 
(2003) exposed these misconceptions to researchers in the hope that through enlightenment 
researchers will better evaluate scores. 



Misconceptions have caused some authors to devalue the reporting of reliability estimates 
in published research (Vacha-Haase, Henson & Caruso, 2002) while others report reliability
coefficients inappropriately (Thompson, 1992, 2003; Wilkinson & APA Task Force on Statistical
Inference, 1999). A better understanding of score reliability can resolve these misconceptions
and enable authors to use reliability coefficients appropriately in literature and speech. The
sections that follow explain the basic idea of score reliability and then focus on the properties of
one of the most commonly reported reliability estimates, Cronbach’s (1951) alpha (α).

Background of Reliability 

Consistency of Scores 

Reliability pertains to the consistency of scores. The less consistency within a given 
measurement, the less useful the data may be in analysis. For example, a recipe calls for two 
grams of a seasoning. The package of seasoning states the contents of the package includes two 
grams of seasoning. To begin cooking, one measures two grams of seasoning that does not seem 
to use the entire package. Curiosity sets in and the person decides to measure the seasoning 
again. To the person’s surprise, the second measure indicates a score more than two grams of 
seasoning. Stubbornly, the person measures the seasoning a third time and notices a score less 
than two grams of seasoning. Baffled at these results, one may begin to question all of the scores
produced by the measuring tool. The person concludes the measurement tool does not measure 
anything. When measurement tools generate random scores, the scores are not reliable. On the 
other hand, suppose the person measured the seasoning multiple times and generated two grams 
each time; this set of scores would then be considered reliable. 

Types of Reliability 

There are several coefficients to estimate the reliability of scores, such as internal 
consistency, test-retest, and form equivalence coefficients. Each type of coefficient estimates 
consistency across different parameters. Internal consistency coefficients estimate the degree to
which scores measure the same concept. To put this in context of the cooking example, the 
individual is testing the weight of the seasoning instead of the chemical composition or pH of the 
seasoning. Test-retest coefficients estimate stability of scores over a period of time. Form 
equivalence coefficients estimate consistency of scores between two test forms. Internal 
consistency coefficients are convenient to calculate because such coefficients require only a single 
measurement given at one time. Internal consistency coefficients are more practical than other 
reliability coefficients due to the lack of time and resources to perform the multiple tests seen in 
test-retest coefficients and the multiple formats seen in form equivalence coefficients. There is no 
preference for a single method. A method should be selected based on the context of the research 
being conducted. 

Properties of Cronbach’s Alpha (α)

There are various types of reliability coefficients. Cronbach’s (1951) alpha is one of the 
most commonly used reliability coefficients (Hogan, Benjamin & Brezinski, 2000) and for this
reason the properties of this coefficient will be emphasized here. 

Type of Reliability Coefficient 

One property of alpha (Cronbach, 1951) is that it is a type of internal consistency coefficient.
Before alpha, researchers were limited to estimating internal consistency of only dichotomously
scored items using the KR-20 formula. Cronbach’s (1951) alpha was developed based on the
necessity to evaluate items scored in multiple answer categories. Cronbach (1951) derived the alpha
formula from the KR-20 formula, extending it to include both dichotomously and polychotomously
scored items:

\[ \text{KR-20} = \frac{K}{K-1}\left[1 - \frac{\sum p_k q_k}{\sigma^2_{Total}}\right] \tag{1} \]

where $K$ is the number of items, $p_k$ is the proportion of people with a score of 1 on the kth item,
$q_k$ is the proportion of people with a score of 0 on the kth item, and $\sigma^2_{Total}$ is the
variance of scores on the total measurement.

Calculating Alpha (α)

Alpha is calculated using the following formula:

\[ \alpha = \frac{K}{K-1}\left[1 - \frac{\sum \sigma_k^2}{\sigma^2_{Total}}\right] \tag{2} \]

where $\sum \sigma_k^2$ is the sum of the $K$ item score variances, and $\sigma^2_{Total}$ is the
variance of scores on the total measurement. By comparing both equations, one can see that the
only difference between the two formulas is the numerators, $\sum p_k q_k$ and $\sum \sigma_k^2$.
The two numerators are computationally equivalent when items are dichotomously scored
(Thompson, 2003).

One way to calculate alpha is to use a statistical software program such as SPSS. Select
Analyze > Scale > Reliability Analysis. Next, select the scores you wish to analyze. Finally, click
Paste to paste the syntax below into your syntax file, and run it.

RELIABILITY
  /VARIABLES=X1 X2 X3
  /SCALE('ALL VARIABLES') ALL
  /MODEL=ALPHA.
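
For readers who want to trace the arithmetic outside SPSS, Equations 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the six-person data set is hypothetical, the function names `cronbach_alpha` and `kr20` are made up for this example, and population variances are used throughout (a sample divisor would give the same alpha, because the divisor cancels in the ratio).

```python
from statistics import pvariance

def cronbach_alpha(rows):
    """Equation 2: rows is a list of people, each a list of K item scores."""
    k = len(rows[0])
    # Sum of the K item score variances (one variance per column).
    sum_item_var = sum(pvariance(item) for item in zip(*rows))
    # Variance of each person's total score.
    total_var = pvariance([sum(person) for person in rows])
    return k / (k - 1) * (1 - sum_item_var / total_var)

def kr20(rows):
    """Equation 1: only meaningful when every item is scored 0 or 1."""
    k, n = len(rows[0]), len(rows)
    p = [sum(item) / n for item in zip(*rows)]   # proportion scoring 1
    sum_pq = sum(p_k * (1 - p_k) for p_k in p)   # q_k = 1 - p_k
    total_var = pvariance([sum(person) for person in rows])
    return k / (k - 1) * (1 - sum_pq / total_var)

# Hypothetical scores for six people on items X1, X2, X3 (assumed data).
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]

alpha = cronbach_alpha(scores)
kr = kr20(scores)
print(round(alpha, 4), round(kr, 4))
```

For these dichotomous scores the two functions return the same value (about 0.68), which illustrates why the two numerators are described as computationally equivalent in that case.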
Ratio of Variances

Another property of alpha (Cronbach, 1951) is that it is a ratio of variances that follows the
general linear model (GLM). In the term $\sum \sigma_k^2 / \sigma^2_{Total}$, one variance is divided
by another, and alpha as a whole estimates the proportion of total score variance that is true score
variance. Given that variance is a squared metric statistic, when we divide a squared metric
statistic (i.e., $\sum \sigma_k^2$) by a squared metric statistic (i.e., $\sigma^2_{Total}$) the result
will also be in a squared metric. One misconception about alpha is that alpha can only be positive
because alpha is a squared metric statistic. However, computationally, alpha can be negative.
When alpha is negative the integrity of the scores should be severely questioned (Thompson, 2003).
A negative alpha is a symptom of two differential diagnoses: (1) an incorrect measurement model or
(2) very bad scores. Alpha is a direct analog of the effect size r², due to the nature of
variance-accounted-for effect sizes such as r², R², and η² (Thompson, 2003). Alpha takes into
consideration the correlation between item scores. More directly, alpha is an estimate of the
squared correlation between true scores and observed total scores. The degree of correlation and
the direction of the relationship will help explain how a negative alpha can be generated. Consider
three possible scenarios: alpha equal to zero, alpha equal to one, and alpha equal to a negative
value. These heuristic examples have been adapted from Henson (2001) and Thompson (2003).

Scenario #1: all item scores are perfectly uncorrelated. According to the
formula for alpha, alpha can be calculated if we have the number of items, the sum of the item
variances, and the total score variance. Table 1 provides information on the number of items, K = 4,
and the item variances. Using the information in Table 1, the sum of the item variances
can be computed as:

\[ \sum \sigma_k^2 = 0.24 + 0.22 + 0.21 + 0.15 = 0.82. \]



Crocker and Algina (1986, p. 95) provide a formula to calculate the total score variance using the
information found in Table 1:

\[ \sigma^2_{Total} = \sum \sigma_k^2 + \left[\sum COV_{ij}\ (\text{for } i < j) \times 2\right] \tag{3} \]
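
As a brief aside, Equation 3 is the standard expansion of the variance of a sum. Written out for the K = 4 items used in the scenarios (the item score symbols $X_1, \dots, X_4$ are introduced here only for illustration), the six covariance terms correspond to the six item pairs with i < j:

```latex
\[
\begin{aligned}
\sigma^2_{Total} &= \operatorname{Var}(X_1 + X_2 + X_3 + X_4) \\
&= \underbrace{\sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \sigma_4^2}_{\sum \sigma_k^2}
 + 2\,\bigl(COV_{12} + COV_{13} + COV_{14} + COV_{23} + COV_{24} + COV_{34}\bigr)
\end{aligned}
\]
```

Each covariance is counted twice because $COV_{ij} = COV_{ji}$, which is why the sum over pairs with i < j is multiplied by 2.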



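Using the information in Table 2, the total variance can be computed using Equation 3:

\[
\begin{aligned}
\sigma^2_{Total} &= \sum \sigma_k^2 + \left[\sum COV_{ij}\ (\text{for } i < j) \times 2\right] \\
&= 0.82 + [0 \times 2] \\
&= 0.82 + 0 = 0.82
\end{aligned}
\]

Alpha can then be found using Equation 2:

\[
\begin{aligned}
\alpha &= \frac{4}{4-1}\left[1 - (0.82 / 0.82)\right] \\
&= \frac{4}{3}\,[1 - 1] \\
&= 1.33\,[0] \\
&= 0
\end{aligned}
\]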

When items are perfectly uncorrelated, the items share no variance; therefore there is no internal
consistency between the item scores. Accordingly, alpha will equal zero when items are perfectly
uncorrelated.
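
The Scenario #1 arithmetic can also be checked with a short Python sketch that applies Equations 3 and 2 directly to a variance/covariance matrix. The function name `alpha_from_covariance` is hypothetical and chosen only for this illustration; the matrix entries are the Scenario #1 values (item variances on the diagonal, zero covariances elsewhere).

```python
from itertools import combinations

def alpha_from_covariance(cov):
    """Alpha from a K x K variance/covariance matrix (list of lists)."""
    k = len(cov)
    # Item variances sit on the diagonal.
    sum_item_var = sum(cov[i][i] for i in range(k))
    # Covariances for every pair with i < j.
    sum_cov = sum(cov[i][j] for i, j in combinations(range(k), 2))
    total_var = sum_item_var + 2 * sum_cov                # Equation 3
    return k / (k - 1) * (1 - sum_item_var / total_var)   # Equation 2

# Scenario #1: perfectly uncorrelated items, so all covariances are zero.
scenario_1 = [
    [0.24, 0.00, 0.00, 0.00],
    [0.00, 0.22, 0.00, 0.00],
    [0.00, 0.00, 0.21, 0.00],
    [0.00, 0.00, 0.00, 0.15],
]

alpha_1 = alpha_from_covariance(scenario_1)
print(alpha_1)
```

Because the covariance sum is zero, the total score variance equals the sum of the item variances, the ratio is 1, and alpha is 0.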
Table 1

Covariance and Correlation Matrices for Scenario #1

        Variance / Covariance            Correlation
Var.    1      2      3      4           1      2      3      4
1       0.24                             1.00
2       .00    0.22                      .00    1.00
3       .00    .00    0.21               .00    .00    1.00
4       .00    .00    .00    0.15        .00    .00    .00    1.00

Note. Adapted from Score reliability: Contemporary thinking on reliability issues by B. Thompson,
2003, p. 15 and from “Understanding internal consistency reliability estimates: A conceptual primer
on coefficient alpha,” by R. K. Henson, 2001, Measurement and Evaluation in Counseling and
Development, 34, 183.



Table 2

Total Score Variance as a Function of Item Variances and Covariances for Scenario #1

Pairs       Covariance / Variance         r / SD
i    j      COV_ij   VAR_i   VAR_j        r_ij    SD_i    SD_j
1    2      .00      0.24    0.22         .00     0.49    0.47
1    3      .00      0.24    0.21         .00     0.49    0.46
1    4      .00      0.24    0.15         .00     0.49    0.39
2    3      .00      0.22    0.21         .00     0.47    0.46
2    4      .00      0.22    0.15         .00     0.47    0.39
3    4      .00      0.21    0.15         .00     0.46    0.39

Note. Adapted from Score reliability: Contemporary thinking on reliability issues by B. Thompson,
2003, p. 15 and from “Understanding internal consistency reliability estimates: A conceptual primer
on coefficient alpha,” by R. K. Henson, 2001, Measurement and Evaluation in Counseling and
Development, 34, 183.



Scenario #2: all item scores are perfectly correlated. Using the information
in Table 3, the sum of the item variances can be computed as:

\[ \sum \sigma_k^2 = 0.24 + 0.22 + 0.21 + 0.15 = 0.82. \]



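Using the information in Table 4, the total variance can be computed using Equation 3:

\[
\begin{aligned}
\sigma^2_{Total} &= \sum \sigma_k^2 + \left[\sum COV_{ij}\ (\text{for } i < j) \times 2\right] \\
&= 0.82 + [(0.23 + 0.22 + 0.19 + 0.21 + 0.18 + 0.18) \times 2] \\
&= 0.82 + [1.22 \times 2] = 3.26
\end{aligned}
\]

The covariances are displayed rounded to two decimal places; the value 1.22 is the sum of the
unrounded covariances. Alpha can then be found using Equation 2:

\[
\begin{aligned}
\alpha &= \frac{4}{4-1}\left[1 - (0.82 / 3.26)\right] \\
&= \frac{4}{3}\,[1 - 0.2515] \\
&= 1.33\,[0.7485] \\
&= .9955
\end{aligned}
\]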



When items are perfectly correlated, there is perfect internal consistency between the item scores.
Accordingly, α is approximately 1 when items are perfectly correlated.
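
The Scenario #2 figures can be reproduced without rounding the intermediate values. The sketch below is an illustration only: it rebuilds each covariance as r × SD_i × SD_j from the item variances, assuming r = 1.00 for every pair, and the variable names are chosen for this example.

```python
from itertools import combinations
from math import sqrt

# Item variances for items 1 to 4, as used in the scenarios.
variances = {1: 0.24, 2: 0.22, 3: 0.21, 4: 0.15}
r = 1.0  # every pair of items is perfectly positively correlated

# Covariance for each pair with i < j: COV_ij = r * SD_i * SD_j.
pair_cov = {
    (i, j): r * sqrt(variances[i]) * sqrt(variances[j])
    for i, j in combinations(variances, 2)
}

k = len(variances)
sum_item_var = sum(variances.values())
sum_cov = sum(pair_cov.values())
total_var = sum_item_var + 2 * sum_cov                  # Equation 3
alpha_2 = k / (k - 1) * (1 - sum_item_var / total_var)  # Equation 2

print(round(sum_cov, 4), round(total_var, 4), round(alpha_2, 4))
```

Carried at full precision, the covariance sum is about 1.218, the total variance about 3.256, and alpha about 0.998. The hand calculation differs slightly because it rounds 4/3 to 1.33. With unequal item variances, alpha falls just short of exactly 1 even when every correlation is 1.00.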
Table 3

Covariance and Correlation Matrices for Scenario #2

        Variance / Covariance            Correlation
Var.    1      2      3      4           1      2      3      4
1       0.24                             1.00
2       0.23   0.22                      1.00   1.00
3       0.22   0.21   0.21               1.00   1.00   1.00
4       0.19   0.18   0.18   0.15        1.00   1.00   1.00   1.00

Note. Adapted from Score reliability: Contemporary thinking on reliability issues by B. Thompson,
2003, p. 16 and from “Understanding internal consistency reliability estimates: A conceptual primer
on coefficient alpha,” by R. K. Henson, 2001, Measurement and Evaluation in Counseling and
Development, 34, 185.



Table 4

Total Score Variance as a Function of Item Variances and Covariances for Scenario #2

Pairs       Covariance / Variance         r / SD
i    j      COV_ij   VAR_i   VAR_j        r_ij    SD_i    SD_j
1    2      0.23     0.24    0.22         1.00    0.49    0.47
1    3      0.22     0.24    0.21         1.00    0.49    0.46
1    4      0.19     0.24    0.15         1.00    0.49    0.39
2    3      0.21     0.22    0.21         1.00    0.47    0.46
2    4      0.18     0.22    0.15         1.00    0.47    0.39
3    4      0.18     0.21    0.15         1.00    0.46    0.39

Note. Adapted from Score reliability: Contemporary thinking on reliability issues by B. Thompson,
2003, p. 16 and from “Understanding internal consistency reliability estimates: A conceptual primer
on coefficient alpha,” by R. K. Henson, 2001, Measurement and Evaluation in Counseling and
Development, 34, 185.



Scenario #3: all item scores are perfectly correlated and have mixed signs.
Using the information in Table 5, the sum of the item variances can be computed as:

\[ \sum \sigma_k^2 = 0.24 + 0.22 + 0.21 + 0.15 = 0.82. \]



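Using the information in Table 6, the total variance can be computed using Equation 3:

\[
\begin{aligned}
\sigma^2_{Total} &= \sum \sigma_k^2 + \left[\sum COV_{ij}\ (\text{for } i < j) \times 2\right] \\
&= 0.82 + [(-0.23 + -0.22 + -0.19 + 0.21 + 0.18 + 0.18) \times 2] \\
&= 0.82 + [-.07 \times 2] \\
&= 0.82 + [-.14] = .68
\end{aligned}
\]

Alpha can then be found using Equation 2:

\[
\begin{aligned}
\alpha &= \frac{K}{K-1}\left[1 - \left(\sum \sigma_k^2 / \sigma^2_{Total}\right)\right] \\
&= \frac{4}{4-1}\left[1 - (.82 / .68)\right] \\
&= \frac{4}{3}\,[1 - 1.2059] \\
&= 1.33\,[-.2059] \\
&= -.2738
\end{aligned}
\]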

When items are perfectly correlated and have mixed signs, as in this example, the sum of the item
variances can be greater than the total score variance. When the sum of the item score variances is
greater than the total score variance, alpha is negative and internal consistency is non-existent
between the item scores; therefore the items are measuring different concepts. In general, as items
are more positively correlated, shared variance increases, increasing internal consistency and
therefore increasing the magnitude of the alpha coefficient.
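
Both the negative alpha of Scenario #3 and the general trend just described can be illustrated with one more Python sketch. It is a heuristic under stated assumptions: the helper `alpha_for` is hypothetical, the item standard deviations come from the scenario variances, and the sweep assumes a single common correlation for every item pair.

```python
from math import sqrt

# Item standard deviations from the scenario variances.
SDS = [sqrt(v) for v in (0.24, 0.22, 0.21, 0.15)]

def alpha_for(corr):
    """Alpha when corr(i, j) gives the correlation for item pair i < j."""
    k = len(SDS)
    sum_item_var = sum(sd ** 2 for sd in SDS)
    sum_cov = sum(
        corr(i, j) * SDS[i] * SDS[j]
        for i in range(k)
        for j in range(i + 1, k)
    )
    total_var = sum_item_var + 2 * sum_cov                # Equation 3
    return k / (k - 1) * (1 - sum_item_var / total_var)   # Equation 2

# Scenario #3: item 1 (index 0) correlates -1.00 with the other items,
# while the remaining items correlate +1.00 with one another.
alpha_3 = alpha_for(lambda i, j: -1.0 if i == 0 else 1.0)

# Sweep a common correlation r across all pairs to show the trend.
sweep = [(r, alpha_for(lambda i, j, r=r: r)) for r in (0.0, 0.25, 0.5, 0.75, 1.0)]

print(round(alpha_3, 4))
for r, a in sweep:
    print(r, round(a, 4))
```

At full precision the Scenario #3 alpha is about -0.274, matching the hand calculation apart from rounding. In the sweep, alpha rises steadily from 0 at r = 0 toward 1 as the common correlation approaches 1.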
Table 5

Covariance and Correlation Matrices for Scenario #3

        Variance / Covariance            Correlation
Var.    1      2      3      4           1      2      3      4
1       0.24                             1.00
2       -0.23  0.22                      -1.00  1.00
3       -0.22  0.21   0.21               -1.00  1.00   1.00
4       -0.19  0.18   0.18   0.15        -1.00  1.00   1.00   1.00

Note. Adapted from Score reliability: Contemporary thinking on reliability issues by B. Thompson,
2003, p. 17 and from “Understanding internal consistency reliability estimates: A conceptual primer
on coefficient alpha,” by R. K. Henson, 2001, Measurement and Evaluation in Counseling and
Development, 34, 185.



Table 6

Total Score Variance as a Function of Item Variances and Covariances for Scenario #3

Pairs       Covariance / Variance         r / SD
i    j      COV_ij   VAR_i   VAR_j        r_ij    SD_i    SD_j
1    2      -0.23    0.24    0.22         -1.00   0.49    0.47
1    3      -0.22    0.24    0.21         -1.00   0.49    0.46
1    4      -0.19    0.24    0.15         -1.00   0.49    0.39
2    3      0.21     0.22    0.21         1.00    0.47    0.46
2    4      0.18     0.22    0.15         1.00    0.47    0.39
3    4      0.18     0.21    0.15         1.00    0.46    0.39

Note. Adapted from Score reliability: Contemporary thinking on reliability issues by B. Thompson,
2003, p. 17 and from “Understanding internal consistency reliability estimates: A conceptual primer
on coefficient alpha,” by R. K. Henson, 2001, Measurement and Evaluation in Counseling and
Development, 34, 185.



Discussion 



Because tests are not reliable, it is important to explore score reliability in virtually all
studies. Reliability coefficients have the ability to impact how researchers interpret study results.
Researchers should be aware of alpha’s properties to accurately gather data and interpret results. A
better understanding of score reliability can resolve common misconceptions and encourage authors
to write and speak cautiously when referring to reliability estimates. The present paper explains
the most frequently used reliability estimate, coefficient alpha, so that the coefficient's conceptual
underpinnings will be understood.